An Empirical Evaluation of Knowledge Sources and Learning Algorithms for Word Sense Disambiguation

نویسندگان

  • Yoong Keok Lee
  • Hwee Tou Ng
چکیده

In this paper, we evaluate a variety of knowledge sources and supervised learning algorithms for word sense disambiguation on SENSEVAL-2 and SENSEVAL-1 data. Our knowledge sources include the part-of-speech of neighboring words, single words in the surrounding context, local collocations, and syntactic relations. The learning algorithms evaluated include Support Vector Machines (SVM), Naive Bayes, AdaBoost, and decision tree algorithms. We present empirical results showing the relative contribution of the component knowledge sources and the different learning algorithms. In particular, using all of these knowledge sources and SVM (i.e., a single learning algorithm) achieves accuracy higher than the best official scores on both SENSEVAL-2 and SENSEVAL-1 test data.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Word Sense Disambiguation using Optimised Combinations of Knowledge Sources

Word sense disambiguation algorithms, with few exceptions, have made use of only one lexical knowledge source. We describe a system which performs unrestricted word sense disambiguation (on all content words in free text) by combining different knowledge sources: semantic preferences, dictionary definitions and subject/domain codes along with part-of-speech tags. The usefulness of these sources...

متن کامل

Psycholinguistics, Lexicography, and Word Sense Disambiguation

Mainstream word sense disambiguation systems have relied mostly on supervised approaches. Complex interactions have been observed between learning algorithms and knowledge sources, but the factors underlying such phenomena are underexplored. This calls for more qualitative analysis of disambiguation results, possibly from an inter-disciplinary perspective. The current study thus preliminarily e...

متن کامل

Knowledge sources for disambiguating highly ambiguous verbs in machine translation

Word sense disambiguation (WSD) is one of the most challenging outstanding problems in the current machine translation systems. An effective proposal in this context will rely on the use relevant knowledge sources. Moreover, it must perform better than the current traditional approaches. We present some experiments with machine learning algorithms traditionally applied to WSD, aiming to discove...

متن کامل

Joining Forces Pays Off: Multilingual Joint Word Sense Disambiguation

We present a multilingual joint approach to Word Sense Disambiguation (WSD). Our method exploits BabelNet, a very large multilingual knowledge base, to perform graphbased WSD across different languages, and brings together empirical evidence from these languages using ensemble methods. The results show that, thanks to complementing wide-coverage multilingual lexical knowledge with robust graph-...

متن کامل

Survey of Word Sense Disambiguation Approaches

Word Sense Disambiguation (WSD) is an important but challenging technique in the area of natural language processing (NLP). Hundreds of WSD algorithms and systems are available, but less work has been done in regard to choosing the optimal WSD algorithms. This paper summarizes the various knowledge sources used for WSD and classifies existing WSD algorithms according to their techniques. The ra...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2002